Redefining Binarization and the Visual Archetype

Authors

  • Anguelos Nicolaou
  • Marcus Liwicki
Abstract

Although binarization is considered passé, it remains a highly popular research topic. In this paper we propose a rethinking of what binarization is. We introduce the notion of the visual archetype as the ideal form of any one document. Binarization can then be defined as the restoration of the visual archetype for a class of images. This definition broadens the scope of what binarization means, but it also suggests that ground-truth should focus on the foreground.

A. Defining Binarization

Binarization has almost become a dirty word in some parts of the Document Image Analysis (DIA) community. In the authors' opinion, the reason is not that it has been solved, nor that it is an ill-posed problem [1] (although it is), so much as the fact that the assumption of high-quality binarization is practically infeasible in many real-world cases. Binarization is also usually required by methods that use smearing, morphological operations, etc. Furthermore, in a recent experimental evaluation by Lamiroy et al. [2] of the contribution that methods have across a full DIA pipeline, a binarization method proved to be the one with the highest positive contribution.

Some attempts at a definition of binarization have been made in the literature. Kavallieratou describes it as a method that discriminates foreground from background, thus removing any kind of noise that obstructs the legibility of the document image [3]. Stathis et al. define it as automatically converting document images into a bi-level form in such a way that the foreground information is represented by black pixels and the background by white ones [4], and Shafait et al. give a very similar definition [5]. Ntirogiannis et al. define it as "the process that segments the document image into text and background by removing any existing degradations" [?].

It is indicative that several high-impact papers addressing binarization as their main topic describe it as selecting a threshold [7], or even wisely avoid defining it at all [8]. In [1], Lopresti and Nagy use binarization as an example of an ill-defined problem. In recent years the DIBCO [9] competitions have become the standard for benchmarking binarization methods. By defining the problem with respect to a given ground-truth, the discussion of what binarization is has been bypassed. Smith et al. [10] address the question of bias in ground-truthing and propose monitoring the effect ground-truth has on method development. What benchmarking on specific datasets does not address is the extent to which a method needs tuning.
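
To make the narrow "selecting a threshold" view of binarization concrete, the following is a minimal sketch of global thresholding on an 8-bit grayscale image. It uses Otsu's between-class-variance criterion as an assumed threshold selector; it is not the method proposed in this paper, and the function names `otsu_threshold` and `binarize` are hypothetical. Following the bi-level convention quoted above, foreground is mapped to black (0) and background to white (255), under the assumption that the text is darker than the page.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Pick the 8-bit cut that maximizes between-class variance (Otsu's criterion).

    `gray` is assumed to be a uint8 array with values in 0..255.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    omega = np.cumsum(prob)                  # probability of the dark class for each cut
    mu = np.cumsum(prob * np.arange(256))    # cumulative first moment for each cut
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    # np.where evaluates both branches, so silence the 0/0 warnings it would raise.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray, threshold=None) -> np.ndarray:
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    if threshold is None:
        threshold = otsu_threshold(gray)
    return np.where(gray > threshold, 255, 0).astype(np.uint8)
```

A single global cut like this presumes that one intensity value separates text from background over the whole page. That requirement often fails on degraded documents, which is one reason a threshold-only definition of binarization is so limited.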


Similar resources

Explore attitude mythical and the archetype to design cultural signs

Cultural signs are simple graphic forms, compressed from the rich culture of nations. They present the outward and conceptual content of a country in a small model, and they reflect a nation's store of archetypes. Archetypes are primary, stable mental patterns. They are the product of collective experience over thousands of years. Archetypes manifest themselves in every human aspect, i...

Quad-pixel edge detection using neural network

One of the most fundamental features of a digital image, and a basic step in image processing, analysis, pattern recognition and computer vision, is the edge of an image, where the precision and reliability of the result directly affect a machine system's comprehension of the objective world. Several edge detectors have been developed in the past decades, although no single edge detector...

Analysis of “Gol -o- Norooz” by Khwaju Kermani According to the Jungian Archetype of Individuation

Carl Gustav Jung, the founder of analytical psychology in the twentieth century, believes that under the appearance of human consciousness exists an eternal collective unconscious which is part of the hereditary psychological factor common in the entire human race. He successfully introduced the common archetypes in the mythology of the di...

Adaptive document binarization - a human vision approach

This paper presents a new approach to adaptive document binarization, inspired by the attributes of the Human Visual System (HVS). The proposed algorithm combines the characteristics of the OFF ganglion cells of the HVS with the classic Otsu binarization technique. Ganglion cells with four receptive field sizes tuned to different spatial frequencies are employed, which, adopting a new activatio...

Journal title:
  • CoRR

Volume abs/1609.09451

Publication date 2016